Much of the work on relational databases that deals with data
dependencies makes a uniqueness (of universal relation) assumption
[dbo]. It has been recognized that this assumption is
problematic [BBG]; nevertheless it is necessary for the
axiomatic approach taken in many papers on the theory of
relational databases. We will describe the problem, investigate
some of the solutions put forward and suggest a new solution.
Many of the problems remain intractable within the realm of
"classical" relational databases and restrictions must be placed
on the use of FDs. An automated method is presented that searches
for violations of the uniqueness assumption.

1.  INTRODUCTION 

Much of the work in database theory makes use of a universal
relation assumption. This is particularly true for schema design
[BERN], [BB], [bbm] (it is also essential to the concept of the
lossless join and other interrelationship concepts). It is
assumed that a universal relation exists that contains all the
attributes of a database and that all the relations of a database
are projections of the universal relation. The assumption is
natural for schema design since it means that attributes have an
invariant meaning over the database and any joins can be taken
without ruining the meaning of the database. The assumption has
two consequences: the uniqueness assumption, i.e. that there can
be at most one dependency from X to Y, where X and Y are sets of
attributes, and the joinability assumption, i.e. that any two
relations can be joined on a common attribute [BP].

We will be concerned with functional dependencies (FDs),
though the uniqueness assumption applies to other dependencies as
well. Armstrong [ARM] presents an axiomatic approach to FDs and
uses a set of axiom schemas to derive all the FDs that follow from
a given set of FDs. If G is a given set of FDs, G+ denotes the
set of derived FDs. The uniqueness assumption must extend to the
derivations of FDs and it can be stated as follows:

Uniqueness Assumption: Let X be a set of attributes and A an
attribute. If (X --> A) ∈ G+ then any derivation of X --> A represents
the same "user intent".

The uniqueness assumption means that syntactically identical
FDs are semantically equivalent. The uniqueness assumption causes
a fundamental problem in that it prevents relational databases
from modelling real world situations. We shall see that it is
sometimes possible, and even necessary, to have more than one FD
between two attributes (or entities). This has been noticed
([BP], [SS]), and the solutions put forth either violate the
atomic nature of attributes (1NF) or put artificial restrictions
on the derivation of FDs.

There is also the problem of verifying the uniqueness
assumption. Let us examine how FDs originate. If the
fundamental construct in defining a relational database is the
relation, then the users supply keys with the schema (which
amounts to specifying some FDs). The users would then be asked to
"find" additional intended FDs. In fact, Bernstein [BERN] takes
the view that FDs are the fundamental construct and that the users
should be asked to supply all "intended" FDs and synthesize the
relations from them.

Beeri and Bernstein [BB] have developed a fast algorithm for
synthesizing a relational database schema in third normal form
from a given set of FDs such that the resulting schema embodies
the original FDs. Their algorithm uses the uniqueness
assumption.


When Beeri and Bernstein's algorithm is used to synthesize a
relational database schema, an attempt should be made to verify the
uniqueness assumption. This would be done in two steps. The first
step has the users go over all the initial FDs, making sure that all
syntactically identical FDs have the same semantic (user) intent.
The first step may cause the database administrator to rename and
add attributes, and only after he "finalizes" the attributes and
FDs could the relations be synthesized. Then, since the first
step does not guarantee that all derived FDs will satisfy the
uniqueness assumption, the users would have to "decide" whether
all derivations (using Armstrong's axioms) of an FD X --> Y have
the same user intent. Though Beeri and Bernstein's algorithm is
linear, any attempt to verify the uniqueness assumption is doomed
to exponential time (counting each human decision as one unit).


Semi-automatic checking for violations of the uniqueness
assumption is preferable to interactively "showing" each
derivation to the user. Such a semantic analyzer is difficult to
find since it is not known how to formalize the "user intent" of
an FD. As a partial solution we classify FDs into three types:
regular, injective and computable. Armstrong's [ARM] axioms can
be applied to these types so that every derivation of an FD will
result in classifying the FD as one of the three types (given the
types of the initial FDs). When two derivations of an FD result
in two different classifications then we have a violation. If two
derivations both result in a computable FD it is sometimes
possible to decide that the computations are different and there
is a violation. In other cases it would not be known if there was
a violation.

The usual solution to a violation of uniqueness is to rename
some attributes. This can cause a multiplicity of attribute names
and in addition may lead to "difficulties" so that sometimes
certain derivations must be "outlawed."


2. Definition and Preliminaries

There is much diversity in notation for relational databases
and we will primarily follow Fagin [FAG], not going into too much
detail. Our view of relational databases is somewhat similar to
that of Cadiou [CAD] and Nicolas [NIC]. Let X be a finite set of
attributes; an X-tuple is a function with domain X (associating
with each attribute a value). If Y ⊆ X and t is an X-tuple then
t[Y] denotes the Y-tuple obtained by restricting the mapping to Y.
There are two notions of a relation: intension and extension.
The extension of a relation over the attributes X, or simply an
X-relation, is a finite set of X-tuples. If R is an X-relation and
Y ⊆ X, then R[Y], the projection of R onto Y, is defined by:
$R[Y] = \{\, t[Y] : t \in R \,\}$.
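
As an illustration only, the definitions of X-tuple, X-relation and projection can be sketched in Python. The representation (an X-tuple as a frozenset of attribute/value pairs, so that a relation can be a set of tuples) and the names make_tuple and project are assumptions of this sketch, not notation taken from [FAG].

```python
def make_tuple(**values):
    """Build an X-tuple: a mapping from attribute names to values.

    A frozenset of (attribute, value) pairs is used so that tuples are
    hashable and a relation can be an ordinary Python set.
    """
    return frozenset(values.items())


def project(relation, attrs):
    """Return R[Y] = { t[Y] : t in R } for the attribute set Y = attrs."""
    attrs = frozenset(attrs)
    return {
        frozenset((a, v) for a, v in t if a in attrs)  # t[Y]
        for t in relation
    }


# Hypothetical two-tuple instance over X = {EMP, DEPT}.
R = {
    make_tuple(EMP="e1", DEPT="d1"),
    make_tuple(EMP="e2", DEPT="d1"),
}
print(project(R, {"DEPT"}))
```

Because a relation is a set, projecting onto DEPT collapses the two tuples into one, as the set-builder definition requires.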

The intension of a relation includes a set of attributes and
as much of the "user intent", in the form of constraints, as
possible. The intension of a relation consists of:

1. A relational form made up of a relation name, R, and a
set of attributes X, usually written R(A1,...,An), where
X = {A1,...,An}

2. A set of keys (which is a partial listing of the FDs)

3. Functional dependencies and other types of dependencies

4. Domain definitions of the attributes

5. Other integrity constraints (see Eswaran [ES] and Hammer
and McLeod [HM] for a taxonomy).


1. and 2. are usually called a relation scheme, as in Beeri and
Bernstein [BB], while in [FAG] the entire intension is called the
relation scheme.


For each intension R there are many extensions. Each
extension (or instance) of R is a finite set, R, of X-tuples
satisfying the constraints of the intension. Thus we
differentiate between intension and extension by underscoring the
intension.


The intension of a database is a finite collection of
relational intensions with additional integrity constraints (that
include more than one relation). By the uniqueness assumption
attributes and their domain definitions are invariant over the
database, so the FDs (and other dependencies) can be considered to
reside in the database as a whole.


The constraints on a database can be stated in any
appropriate language such as first order predicate logic,
SEQUEL or QUERY BY EXAMPLE. It is possible to discuss the set of
all extensions of a database (which is infinite in general) but
many questions (consistency, derivability) may be undecidable.
For details consult Gallaire and Minker [GM] and Nicolas [NIC].

The constraints for which the above questions are important
are those that affect the structure of the database. FDs and
other dependencies affect the relational database schema since the
normal forms are stated in terms of the FDs. Fortunately, under
the uniqueness assumption, questions of consistency and
derivability about FDs are decidable.


An FD is denoted X --> Y, where X and Y are sets of
attributes. The only information that the above notation imparts
is that for any relation R whose attributes include X ∪ Y and for
any instance R of R, if two tuples coincide on X they must also
coincide on Y. In other words, for any instance, R, of a relation
containing the attributes X ∪ Y, {<t[X],t[Y]> : t ∈ R} is a finite
partial function from dom(X) to dom(Y).

Formally, $\forall t \in R\ \forall s \in R\ \big( t[X] = s[X] \rightarrow t[Y] = s[Y] \big)$.
Sometimes f:X --> Y
is written, where f denotes a canonical name for the partial
function from dom(X) to dom(Y) which is dependent on the extension
(and changes as the extension does). If f:X --> Y and R is an
instance of R then f_R, the realization of f in R, is the finite
partial function above. Note that the uniqueness assumption means
that the function from dom(X) to dom(Y) is invariant over any relation
containing those attributes in any given instance of the database.
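
The definition can be made concrete with a small Python sketch that builds the realization of an FD in an instance, or reports that the FD is violated. Representing an instance as a list of dicts and the names realization and satisfies are assumptions made here for illustration.

```python
def realization(instance, X, Y):
    """Return the realization of X --> Y in the instance as a dict
    mapping t[X] to t[Y], or None if two tuples agree on X but not on Y.

    instance: iterable of dicts (attribute name -> value).
    X, Y: sequences of attribute names.
    """
    f = {}
    for t in instance:
        x_val = tuple(t[a] for a in X)
        y_val = tuple(t[a] for a in Y)
        # setdefault stores y_val the first time x_val is seen and
        # returns the stored value afterwards.
        if f.setdefault(x_val, y_val) != y_val:
            return None  # the FD is violated in this instance
    return f


def satisfies(instance, X, Y):
    """True if the instance satisfies the FD X --> Y."""
    return realization(instance, X, Y) is not None


rows = [
    {"EMP": "e1", "DEPT": "d1"},
    {"EMP": "e2", "DEPT": "d1"},
]
print(realization(rows, ["EMP"], ["DEPT"]))
print(satisfies(rows, ["DEPT"], ["EMP"]))
```

Note that such a check only looks at one extension; it says nothing about the "user intent" behind the FD, which is the subject of the uniqueness assumption.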


Armstrong [ARM] gave a set of axiom schemas for deriving FDs
from a given set of FDs and showed that the system is sound and
complete. Beeri and Bernstein [BB] use the following equivalent
axioms.

A1: (Reflexivity) X --> X

A2: (Augmentation) If X --> Z then X ∪ Y --> Z

A3: (Pseudotransitivity) If X --> Y and Y ∪ Z --> W then X ∪ Z --> W.
3. Confirming and Correcting the Uniqueness Assumption

If G is a set of FDs, then G+ is the closure of G under the
above axioms. An important part of Beeri and Bernstein's
algorithm is to decide whether a given FD lies in G+. If X --> Y
can be derived from G it can be derived by an infinite number of
derivations. By the uniqueness assumption Beeri and Bernstein can
assume any derivation of X --> Y represents a unique "user intent."
Hence they need only search for one such derivation. They only
have to search derivation trees of height at most the number of
attributes appearing in G, since a derivation with a loop (i.e. one
that goes through an attribute twice) is the same without the loop,
since X --> X must be the identity mapping by uniqueness.
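
To make the membership question concrete, here is a minimal Python sketch that decides whether X --> Y lies in G+ by computing the closure of X under G. This is the simple fixed-point method, not the linear-time algorithm of [BB]; the names closure and derivable and the representation of an FD as a pair of sets are assumptions of this sketch.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the FDs.

    fds: iterable of (lhs, rhs) pairs, each a set of attribute names.
    Repeatedly applies every FD whose left side is already covered
    until nothing new is added.
    """
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closed and not set(rhs) <= closed:
                closed |= set(rhs)
                changed = True
    return closed


def derivable(X, Y, fds):
    """True if X --> Y is in G+, where G is given by fds."""
    return set(Y) <= closure(X, fds)


G = [({"EMP"}, {"DEPT"}), ({"DEPT"}, {"MGR"})]
print(derivable({"EMP"}, {"MGR"}, G))
print(derivable({"MGR"}, {"EMP"}, G))
```

The sketch finds that some derivation exists, which is all that is needed under the uniqueness assumption; it does not enumerate the different derivations whose intents would have to be compared.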

The scenario that Beeri envisages [BP] is that first the
database administrator checks that the uniqueness assumption is
not violated by consulting with the users and then, after the
attributes and FDs are finally set, the linear algorithm for
creating the relational database schema in 3NF can be used. The
correctness of the uniqueness assumption must be a matter of
belief since any method for verifying the uniqueness assumption
will involve comparing different derivations of a single FD. By
the above method, if X --> Y is not unique there are two derivations
of X --> Y by trees of height at most twice the number of attributes
(since at most one loop is needed). Since we would have to search
all derivations the solution is at best exponential. When a
violation of uniqueness is discovered the usual remedy is to
change attribute names so that two different FDs are produced.


This would change the final relational database schema synthesized
by the Beeri and Bernstein algorithm.

Primitively the "user" could be used as an oracle to decide
whether derivations are unique. If a violation is found
attributes can be renamed but this would produce more derivations and
possibly violations. It is not clear that such a process
terminates by some given bound (an example of this will be
discussed later). Beeri and Bernstein suggest an alternative
solution: simply reject some inferences. We shall see that this
may be necessary.

4. Classification and Discussion of Functional Dependencies

In this section we will classify FDs and discuss the problems
they cause vis-a-vis the uniqueness assumption. We will work
within the universal relation assumption.

4.1  Classification 


Regular FDs: Regular FDs are those that have no additional semantic
meaning besides what is demanded in the definition of an FD, i.e.
the realized function of X --> Y can be any finite partial function
from dom(X) to dom(Y).

Examples: EMP --> DEPT, DEPT --> MGR.

Injective FDs: Injective FDs have the extra restriction that the
realized functions must be one to one, i.e.
if f:X --> Y is injective then:

$\forall t \in R\ \forall s \in R\ \big( t[X] = s[X] \leftrightarrow t[Y] = s[Y] \big)$

We denote such an FD by X <--> Y.

Example: [illegible in source]
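
The extra restriction on injective FDs amounts to requiring the FD in both directions, which a short Python sketch can check on a given instance. The helper names holds and is_injective, the list-of-dicts representation and the attribute EMP.NO are assumptions made for illustration.

```python
def holds(instance, X, Y):
    """True if no two tuples agree on X while differing on Y."""
    seen = {}
    for t in instance:
        x_val = tuple(t[a] for a in X)
        y_val = tuple(t[a] for a in Y)
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True


def is_injective(instance, X, Y):
    """True if t[X] = s[X] exactly when t[Y] = s[Y] for all tuples t, s,
    i.e. both X --> Y and Y --> X hold in the instance."""
    return holds(instance, X, Y) and holds(instance, Y, X)


rows = [
    {"MGR": "m1", "EMP.NO": 101},
    {"MGR": "m2", "EMP.NO": 102},
]
print(is_injective(rows, ["MGR"], ["EMP.NO"]))
```

As with regular FDs, passing this check on one extension does not establish the constraint; the classification itself is a statement about intent that the database administrator must supply.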

Computable FDs:

An FD f:X --> Y is computable if t[Y] can be computed from
t[X] (in the language in which the database constraints are stated),
the database instance and the database constraints. We shall
denote such an FD by F:X --> Y, where F is upper case and represents the "real"
function involved (stated in the proper language). We reserve
capitals for computable functions.

Examples:

A. F:SALARY, NUMBER.OF.DEP --> WITHOLDING.TAX
This is an example of direct computation from
t[SALARY, NUMBER.OF.DEP].

B. If A and B are attributes we may have A+B=K, a constant, and
G:A --> B where G=K-A.

Here the computable FD is one to one, but the computability is the
more essential property.

C. H:DEPT --> NUMBER.OF.EMP. This FD is dependent on a column
in the particular instance of the database.

The algorithm for H is to count the number of employees in the
department for each instance.

Later we will show an example where FDs are computable from other
FDs, the database instance and t[X].
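
Example C can be sketched in Python as follows. The function name number_of_emp and the assumption that the instance contains one row per employee with a DEPT column are illustrative choices, not part of the original formulation.

```python
from collections import Counter


def number_of_emp(emp_rows):
    """Realize H:DEPT --> NUMBER.OF.EMP for one database instance
    by counting the employee rows that fall in each department."""
    return dict(Counter(row["DEPT"] for row in emp_rows))


rows = [
    {"EMP": "e1", "DEPT": "d1"},
    {"EMP": "e2", "DEPT": "d1"},
    {"EMP": "e3", "DEPT": "d2"},
]
print(number_of_emp(rows))
```

The result depends on the whole EMP/DEPT column of the instance and not just on the single value t[DEPT], which is what distinguishes this case from example A.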
4.2 Splitting Attributes

In many cases the uniqueness assumption is salvaged by
"splitting" an attribute into two distinct ones. As an example
[BP], consider f:EMP --> DEPT, g:EMP --> FLOOR and h:DEPT --> FLOOR,
with the semantic meanings: g(EMP) is where the employee works and
h(DEPT) is where the main department office is located. Using
transitivity, we get hf:EMP --> FLOOR which has a different
semantic meaning than g, violating the uniqueness assumption. The
database administrator splits FLOOR into two attributes, EMP.FLOOR
and DEPT.FLOOR. For some purposes it will still remain natural to
treat FLOOR as a single attribute, as in
FLOOR --> VOLUME.OF.FLOOR, FLOOR --> NUMBER.OF.WINDOWS.ON.FLOOR, etc.
Because of splitting the extra FDs
EMP.FLOOR --> VOLUME.OF.FLOOR and DEPT.FLOOR --> VOLUME.OF.FLOOR must
be added. In addition the attribute VOLUME.OF.FLOOR must be split
into VOLUME.OF.EMP.FLOOR and VOLUME.OF.DEPT.FLOOR for the same
reasons as above. To show where this can lead take the FDs
SS# --> HOME.ADDRESS and SS# --> BUSINESS.ADDRESS and the chain of
natural FDs ADDRESS --> ZIPCODE --> STATE --> GOVERNOR --> PARTY. This
could lead to an attribute HOME.ZIP.STATE.GOVERNOR.PARTY (though
HOME.PARTY would do, but its meaning would be obscure). This chain
can indeed be very long, causing an enormous proliferation of
attributes. Note that the split attribute names actually impart
the path taken to them and perhaps this should be a hint as to the
direction research should follow.
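
The FLOOR example can be illustrated with a small Python sketch in which the realizations of f, g and h are given as dicts. The data values and the names compose and conflicting_arguments are hypothetical; the sketch only shows that the derived function hf and the stated function g can disagree on an instance.

```python
def compose(outer, inner):
    """Return the composition x -> outer[inner[x]] as a dict,
    defined only where both steps are defined."""
    return {x: outer[y] for x, y in inner.items() if y in outer}


def conflicting_arguments(first, second):
    """Arguments on which two realizations are both defined but differ."""
    return {x for x in first if x in second and first[x] != second[x]}


f = {"e1": "d1", "e2": "d1"}   # EMP --> DEPT
g = {"e1": 3, "e2": 1}         # EMP --> FLOOR (where the employee works)
h = {"d1": 1}                  # DEPT --> FLOOR (main department office)

hf = compose(h, f)             # the derived EMP --> FLOOR
print(conflicting_arguments(g, hf))
```

A non-empty result shows that the two derivations realize different functions. An empty result proves nothing, since two FDs with different intents may happen to coincide on a particular instance.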


Smith and Smith [SS] retain a generic
attribute for address, but in doing so they leave the realm of
"flat databases" since address is no longer atomic and the
relation would not be in 1NF. The vast majority of researchers in
relational databases assume 1NF, and without it both the
hierarchical and network databases can be formulated in the
relational form [ja].

We have seen two causes for splitting. In the case of
EMP --> FLOOR and DEPT --> FLOOR the attribute FLOOR plays two roles
and formally must be treated as two distinct attributes. The
functional relationship SS# --> ADDRESS is not a multivalued
relationship, but rather should be represented as two different
FDs. Since both FDs have different "semantical" ranges which are
subsets of the original meaning of the domain of ADDRESS, the
uniqueness assumption requires that we split ADDRESS into
HOME.ADDRESS and BUSINESS.ADDRESS. Thus we see that the
definition of an FD f:X --> Y contains an implicit assumption that
the "semantical" range of f and dom(Y) are the same. The
Smith and Smith solution (of retaining the generic attribute
ADDRESS) has a drawback (for relational database formalists), in
that we must be able to go from ADDRESS to HOME.ADDRESS and
BUSINESS.ADDRESS and back. Even though the connection between
ADDRESS and HOME.ADDRESS is the identity and hence injective it
clearly violates the uniqueness assumption. We will take up the
question when we discuss injective FDs.
4.3 Injective FDs

Injective FDs present a more serious problem to the
formalism. Consider the classical example of f:EMP --> MGR and
g:MGR <--> EMP, where g(MGR) means the manager's employee number.
This gives us two distinct FDs from EMP --> EMP (one the identity,
the other gf:EMP --> EMP). This can be temporarily solved by
changing g to g:MGR --> MGR.EMP. Unfortunately this leads to
additional problems. Assume we have a hierarchy of managers
(manager of managers, etc.); how would the manager of a manager
be determined? If the manager is treated as a regular employee in
EMP --> MGR an FD EMP.OF.MGR <--> EMP is needed which again causes a
violation. Otherwise an FD EMP.OF.MGR --> MGR.OF.MGR is needed, and so
on until the highest manager. This is analogous to a genealogy
database with a Son, Father relation. In such a case attributes
for Grandfather, Greatgrandfather, etc. are needed, until Adam. With such a
procedure it is possible to create an infinite sequence of
attributes. The only feasible solution is to outlaw problematic
derivations and consider MGR-of-MGR etc. as computable.
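
Treating MGR-of-MGR and its successors as computable can be sketched as repeated application of the realization of EMP --> MGR. The sketch below assumes that managers are identified by the same values as employees, so that the function can be applied to its own result; the name iterate and the sample data are illustrative.

```python
def iterate(f, x, n):
    """Apply the finite partial function f (a dict) n times to x.

    Returns None as soon as the chain leaves the domain of f,
    e.g. when the highest manager has been passed.
    """
    for _ in range(n):
        if x not in f:
            return None
        x = f[x]
    return x


# Hypothetical realization of EMP --> MGR; e3 is the highest manager.
mgr_of = {"e1": "e2", "e2": "e3"}

print(iterate(mgr_of, "e1", 1))   # manager of e1
print(iterate(mgr_of, "e1", 2))   # manager of the manager of e1
```

No attribute MGR.OF.MGR, MGR.OF.MGR.OF.MGR and so on has to be stored; each level is obtained on demand, so the potentially infinite sequence of attributes never enters the schema.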

If f:X <--> Y is an isomorphism then attributes X and Y
represent different aspects of the same "entity". If f:X <--> Y is
injective and the "semantic" range of f is a subset of dom(Y)
then, if we wish to conform with the above implicit assumption, we
must split the attribute Y so that the resulting FD is onto
(surjective). But we have seen that splitting does not work for
f:MGR <--> EMP, since it would require an infinite sequence of
splittings. The semantic meaning that f:MGR <--> EMP is meant to
[...] true even if f is not the identity [...]
a subset of dom(EMP) via f in any [...]
formalism such FDs must be treated in [...]


4.4  Computable  FDs 

One  may  ask  if  computable  attributes  should  be  in  a  database 
altogether.  The  answer  is  that  in  general  they  should  not.  If 
the  computation  is  cheap  it  can  be  recalculated  every  time  there 
is  a  query.  Even  if  not  the  attribute  should  be  virtual  in  the 
following  sense: 

1. The attribute should be attached to an appropriate
relation but not be considered in the relational schema
and should not take part in derivations of FDs or
decisions about the various normal forms.

2. Every time an update is made that affects the value of the
virtual attribute in a tuple, it should be recalculated.

3.  It  should  be  included  among  the  attributes  for  queries. 

Thus computable attributes would not cause any anomalies and
need not be considered for the construction of normal forms. Of
course if F:A --> B was computable and a user defined an FD B --> C,
the database manager would have to include A --> C in the set of
FDs ([...] itself computable). The only problem this would present is for
injective computations F:A --> B and F^-1:B --> A (as is the case for
A+B=K). Then we would have to decide which was more "basic" and
this may not be known by the user. Hence in such cases it is
better to leave them in and consider them for the normal forms.
It seems that in practice computable FDs are included among the
attributes of databases even if they are not one-to-one. This is
the case for some of the examples in Beeri and Bernstein [BB].

There are computable FDs that depend on other FDs, the
database instance and t[X]. Take the FDs H:DEPT --> NUMBER.OF.EMP
and g:DEPT --> MGR, where one manager may manage more than one
department. We can define a computable G:MGR --> NUMBER.OF.EMP
that depends on H and g, by adding up the H(dept) such that
g(dept)=mgr for a given mgr ∈ dom(MGR). Later we develop a syntax
for handling such computations. Given the FD f:PERSON --> FATHER,
the grandfather, greatgrandfather etc. of a person becomes
computable from f by f(f(p)), f(f(f(p))) etc. This assumes that
dom(FATHER) ⊆ dom(PERSON) which causes an intrinsic problem to the
FD formalism.
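
The computation of G from H and g can be sketched in Python as below. The realizations are given as dicts, and the function name employees_per_manager and the sample values are assumptions made for this illustration.

```python
def employees_per_manager(H, g):
    """Realize G:MGR --> NUMBER.OF.EMP from
    H:DEPT --> NUMBER.OF.EMP and g:DEPT --> MGR
    by summing H(dept) over all departments with g(dept) = mgr."""
    G = {}
    for dept, mgr in g.items():
        G[mgr] = G.get(mgr, 0) + H.get(dept, 0)
    return G


H = {"d1": 2, "d2": 3, "d3": 1}            # employees per department
g = {"d1": "m1", "d2": "m1", "d3": "m2"}   # manager of each department

print(employees_per_manager(H, g))
```

G is not obtained by composing H with g, since g runs from DEPT to MGR; it requires inverting g on the instance and aggregating, which is why it is classified as computable and not as a derived regular FD.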
Beeri [BP] suggested that FDs that represent a subset
constraint, such as MGR --> EMP, should not be treated as FDs.
Instead we allow this functional relationship to be expressed by a
subset constraint, i.e. MGR ⊆ EMP. Note that A ⊆ B does not
necessarily mean that dom(A) ⊆ dom(B), because the MGR ID does not
have to be equal to the manager's EMP ID (the EMPs may be six digit
numbers and the MGRs may be five digit, for instance). A ⊆ B
means that the functional relationship between dom(A) and dom(B)
is an injection (which is often the identity) and any attribute
that B has A also has, i.e. A and B represent the same type of
entity and they both have common properties (though A may have specific
attributes not related to B).

It is also possible to have two attributes that represent the
same type of entity but neither is a subset of the other. Beeri
calls these compatible attributes and we denote this constraint by
COMP(A,B). As an example take a parole officer, PO, that
determines a social worker, SW. If we treated PO --> SW as an FD we
would obviously have the same problems as before since both are
subsets of Person and both would have salaries, yielding two FDs
PO --> SALARY. Instead we describe the relationship by
PO --} SW and COMP(PO,SW). We need the PO --} SW since COMP(PO,SW)
merely states that the attributes are compatible and does not give
any functional direction. We may think of --} as a "dead end"
FD, since it will not play a role in the derivations. Note that
A ⊆ B implies COMP(A,B) and A --} B. In most cases we will
have COMP(A,B) when there is a third attribute C such that A ⊆ C
and B ⊆ C.

The difficulties that develop with injective and compatible
FDs between attributes belonging to the same type of entity can be
solved by using the "dead end" FDs, A --} B. Functional relations
of the form A --} B must not be used in pseudotransitivity (which
is analogous to the natural join). If A --} B, B may end up in a
relation where A is the key, but will not cause any anomalies
since nothing in the relation will be dependent on B. So, for the
relation containing A in a key, nothing is transitively dependent
on B and B cannot be an essential part of any key. If A --} B is
the identity then B could be left out completely, as in the case
of computable FDs, and only be used for queries. Later we will
discuss the join on compatible attributes for queries.
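
One possible reading of the rule that dead-end FDs take no part in pseudotransitivity is sketched below in Python: the closure is computed with the regular FDs only, and the right sides of dead-end FDs are then attached as terminal attributes that never trigger further derivations. The function name closure_with_dead_ends, the pair-of-sets representation and the sample FDs are assumptions of this sketch, not a procedure given in [BP] or [BB].

```python
def closure_with_dead_ends(attrs, regular, dead_end):
    """Return (closed, terminal).

    closed:   attributes derivable from attrs using regular FDs only.
    terminal: attributes reached by one dead-end FD A --} B whose left
              side lies in closed; they are never used to fire an FD.
    Both FD lists contain (lhs, rhs) pairs of attribute-name sets.
    """
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in regular:
            if set(lhs) <= closed and not set(rhs) <= closed:
                closed |= set(rhs)
                changed = True
    terminal = set()
    for lhs, rhs in dead_end:
        if set(lhs) <= closed:
            terminal |= set(rhs) - closed
    return closed, terminal


regular = [({"EMP"}, {"MGR"}), ({"ASST"}, {"SALARY"})]
dead_end = [({"MGR"}, {"ASST"})]   # MGR --} ASST

print(closure_with_dead_ends({"EMP"}, regular, dead_end))
```

In this run ASST is reached from EMP through MGR, but SALARY is not derived from it, because the path is cut at the dead-end FD.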

The proliferation of attribute names, caused by splitting,
can be alleviated by the above method. We can have
EMP --> EMP.FLOOR, DEPT --> DEPT.FLOOR and
DEPT.FLOOR, EMP.FLOOR ⊆ FLOOR. Since this implies
DEPT.FLOOR --} FLOOR, there would be no need for continued
splitting as was the case in the section on splitting. Though the
classification of a subset constraint depends on the "judgment" of
the database administrator, in any conceivable case where B must
be split into two attributes, B1 and B2, because there are two
distinct FDs from A --> B, we will have B1, B2 ⊆ B.
[...] being the same type of entity) will not appear in any relation
with A in a key and will only be accessed via queries (that will
have extra joining ability with conventions to distinguish
attributes reached by different paths). On the other hand if A
has properties specific for A (not for B) then we may have FDs of
the form A --> C for those attributes.

Let us examine a small database as an example. Let us assume
that we are dealing with a company that has employees and managers,
where only managers have assistants and company cars, together with
the FDs concerning FLOORs that we discussed earlier. The following FDs
are evident: EMP --> MGR, EMP --> DEPT, EMP --> SALARY. As
mentioned above the relationship between EMP and the FLOOR he
works on must be of the form EMP --> EMP.FLOOR, since we also have
DEPT --> DEPT.FLOOR. We thus get EMP.FLOOR, DEPT.FLOOR ⊆ FLOOR and
the FD FLOOR --> VOLUME. We also have MGR ⊆ EMP and ASST ⊆ EMP,
hence COMP(MGR,ASST) and we must write MGR --} ASST. On the other
hand we can use the FD MGR --> CAR as a regular FD, since they are
not compatible and only managers have cars. For the purposes of
insert, delete and update you could not have SALARY in a relation
that contained MGR in its key. Only a query could construct the
necessary joins.

The practical outcome of such an approach would be to shorten
the derivation paths used for FDs, since paths are cut off by
--}s. But this is realistic since long derivations are probably
[...] paths are now put onto the query language and there must be some
uniform system to name attributes on joins over compatible types.

Let us examine how the above solution would behave with a
formal treatment of relational databases. Let us assume that the
schemas are synthesized using Bernstein's method. With the help of
the users the database administrator has to decide which
attributes are subsets of other attributes or compatible
attributes. In particular he will have to decide which functional
relationships are of the form A --} B. Then under the "belief"
that the remaining FDs (i.e. without the --} type) are cleansed
from inconsistencies, the synthesizing algorithm can be performed
on the FDs. Next each A --} B should be examined. If the
relationship is the identity, then it need not appear in the
schema and must only be known by the query language (since it is
computable). On the other hand if A --} B is not the identity, B
must appear in a relation where A is the key. Note that we are
assuming that A is a single attribute, since we do not consider a
pair of attributes forming a compatible entity. If A already is a
key in the synthesized schema, B may be added to the appropriate
relation (this must be "remembered" for the query language),
otherwise a special relation of the form R(A,B) is formed. B
cannot appear in a relation where A is part of a key since B would
not be fully dependent on the key, causing the common anomalies.


For the sake of efficiency it may be useful to give some
priority to attributes A, where A --} B, when merging equivalent
keys in the synthesis algorithm. This saves unnecessarily adding
relations (in [BB] the number of relations is minimal).

Beeri [BP] suggests that the query language be allowed to
join on compatible attributes. In such joins, new attributes are
created and they must be named by a convention that somehow
specifies the query path. If we have A -}B, R(A,B,...) and
S(B,C,...) the query language may perform a join on B. But the
attributes appearing in S must be renamed to reflect that they
came from A, i.e. the joined relation should look something like
RS(A,B,A.C,...). Of course the B in R must somehow be singled out
so that the query language knows how to perform the joins and
renaming. This procedure creates a potentially infinite sequence of
attributes (as in father, grandfather, greatgrandfather etc.).
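
A minimal sketch of such a join with renaming follows, assuming that
relations are lists of dictionaries and that the query path is
recorded as a dotted prefix. The function name join_compatible and
the sample tuples are hypothetical.

```python
def join_compatible(r, s, a, b):
    """Join r and s on attribute b.

    Every other attribute of s is renamed with the prefix 'a.' so that
    its name records the path through which it was reached.
    """
    joined = []
    for r_tuple in r:
        for s_tuple in s:
            if r_tuple[b] == s_tuple[b]:
                row = dict(r_tuple)
                for name, value in s_tuple.items():
                    if name != b:
                        row[f"{a}.{name}"] = value
                joined.append(row)
    return joined


r = [{"A": "a1", "B": "b1"}, {"A": "a2", "B": "b2"}]
s = [{"B": "b1", "C": "c1"}, {"B": "b2", "C": "c2"}]
rs = join_compatible(r, s, "A", "B")
print(rs)
```

Joining the result again along another compatible attribute would
lengthen the prefix once more, which is how the unbounded sequence of
attribute names arises.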

6. Differentiating Between FD Derivations

The above solution terminates many derivation paths (i.e.
join paths) for FDs by changing many to the form A -}B. The
resulting system still could (and probably does if it is large)
contain violations of the uniqueness assumption. Derivation paths
come into full view in the query language. Under any circumstance
detection of different functional relations between the same
entities (if not attributes) is of importance. It would also be
of interest to see which derivations can be equated (see [bun]).
Classifying FDs into regular, injective and computable can be
used, as a first step, in detecting different FDs between the same
attributes. The program would be as follows:

The database administrator classifies the FDs and compatible
attributes to the best of his knowledge. Regular FDs are written
A-->B, injective FDs A<-->B and computable FDs F:A-->B, where F is
a pointer to the computation. Armstrong's axioms are adapted to
include the FD classifications. Using the axioms FDs can be
derived and when more than one derivation for an FD from A to B is
found they can be compared. If they are of different types then
there is a definite violation (which can be amended). Otherwise
if they are both computable then in most cases it will be possible
to see if the computations are equivalent (in general the question
of equivalence of two computations is undecidable for a
sophisticated enough language). If both are either regular or
injective then the question of uniqueness remains unanswered. In
the future it may be possible to make finer classifications
combined with equivalence classes of derivations, improving the
detection process.
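
The comparison step can be sketched as follows. The sketch assumes
that a derivation is represented as a pair (type, computation) and
that an equivalence test for computations is supplied by the caller
and may answer None when it cannot decide. The names compare and
equivalent are hypothetical.

```python
def compare(d1, d2, equivalent=None):
    """Compare two derivations of an FD between the same attributes.

    Each derivation is (kind, computation) where kind is 'regular',
    'injective' or 'computable'. Returns 'violation', 'same' or
    'unknown'.
    """
    kind1, comp1 = d1
    kind2, comp2 = d2
    if kind1 != kind2:
        return "violation"          # different types: definite violation
    if kind1 == "computable":
        verdict = equivalent(comp1, comp2) if equivalent else None
        if verdict is None:
            return "unknown"        # equivalence could not be decided
        return "same" if verdict else "violation"
    return "unknown"                # both regular or both injective


print(compare(("regular", None), ("injective", None)))
print(compare(("regular", None), ("regular", None)))
```

Only the first outcome is conclusive without further knowledge. Two
regular (or two injective) derivations leave the question open, as
stated above.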

The above process is unfortunately exponential (though we
have mentioned that a derivation tree of at most height two times
the number of attributes would be needed to check for violations),
but it is at least an automated process to search for violations.
The adapted Armstrong axioms can be written as follows:

A1 a. X<-->X

      Hence the identity is not considered computable.

   b. if X<-->Y then Y<-->X

A2 a. if X-->Z (or X<-->Z) then X∪Y-->Z

   b. if F:X-->Z then F*:X∪Y-->Z where F*(X,Y)=F(X)

A3 a. if [(X-->Y and Y∪Z-->W) or
          (X<-->Y and Y∪Z-->W) or
          (X-->Y and Y∪Z<-->W)] then
      X∪Z-->W

   b. if X<-->Y and Y∪Z<-->W then X∪Z<-->W

   c. if F:X-->Y and (Y∪Z-->W or Y∪Z<-->W) then F*:X∪Z-->W
      where if f is the canonical name for Y∪Z-->W (Y∪Z<-->W)
      then F*(X,Z)=f(F(X),Z).

   d. if (X-->Y or X<-->Y) and F:Y∪Z-->W then F*:X∪Z-->W
      where if f is the canonical name for X-->Y (X<-->Y) then
      F*(X,Z)=F(f(X),Z)

   e. if F1:X-->Y and F2:Y∪Z-->W then F*:X∪Z-->W where
      F*(X,Z)=F2(F1(X),Z), unless Z=∅ and F* is the identity
      function in which case we have X<-->X.
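
How the classification propagates through A2 and A3 can be
summarized in a small sketch. It tracks only the type of the derived
FD and not the computation itself. The function names augment and
compose and the flag is_identity are hypothetical.

```python
REGULAR, INJECTIVE, COMPUTABLE = "regular", "injective", "computable"


def augment(kind):
    """A2: the type of X∪Y-->Z given the type of X-->Z."""
    return COMPUTABLE if kind == COMPUTABLE else REGULAR


def compose(first, second, is_identity=False):
    """A3: the type of X∪Z-->W from X-->Y (first) and Y∪Z-->W (second).

    is_identity flags the exception in A3(e): Z is empty and the
    composed computation is the identity, which yields X<-->X.
    """
    if first == COMPUTABLE and second == COMPUTABLE and is_identity:
        return INJECTIVE
    if COMPUTABLE in (first, second):
        return COMPUTABLE        # A3(c), A3(d), A3(e)
    if first == second == INJECTIVE:
        return INJECTIVE         # A3(b)
    return REGULAR               # A3(a)


print(compose(REGULAR, INJECTIVE), compose(INJECTIVE, INJECTIVE))
```

The sketch makes visible that an injective FD is preserved only when
both premises are injective, and that augmentation by A2(a) always
yields a regular FD.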

Notes:

1. In c. and d. F* is computable in f, which is regular (or
injective). F* is not fully computable, since it depends on f,
which [...]. Our goal, in [...] computable FDs, is to equate or
differentiate between them and to differentiate them from
non-computable FDs. This goal can sometimes be accomplished by
[...] F* as computable, as we shall see later.

2. The identity function is not considered computable, since
otherwise any FD would be computable by c.

3. In e. we did not want F* to be declared computable if it was
the identity function. We shall assume that it is possible to
decide that F* is the identity, though theoretically there are
pathological cases where this could not be decided. The detection
of the identity would normally occur where F:A-->B is an
injection and we also have F^(-1) (and the computable attributes
were retained as mentioned above).

We shall illustrate how the above classification of FDs can
be used to automatically detect violations (that may be missed by
the database administrator) by using some examples given in
Beeri and Bernstein [BB]. In order to define computable functions
we need a function manipulation language (we cannot use relations
since we are only given FDs). We will develop only sufficient
tools to illustrate our examples intuitively.
Basic FDs (denoted by lower case letters) are treated as atoms.
We allow composition of functions (this was already used in the
adapted axioms). Let $\alpha : A \to B$ (Greek letters represent
both computable and non-computable FDs). Let dom(A) represent the
finite subset realized in a database extension. Let $\alpha(A)$
represent the set $\{\alpha(a) : a \in dom(A)\}$. For
$b \in dom(B)$, $\chi_{\alpha(A)=b}$ is the characteristic function
of the predicate $\alpha(A)=b$ defined over dom(A), i.e.,
$\chi_{\alpha(A)=b} : dom(A) \to \{0,1\}$ defined by:

$$
\chi_{\alpha(A)=b}(a) =
\begin{cases}
1 & \text{if } \alpha(a) = b \\
0 & \text{otherwise}
\end{cases}
$$

We allow taking the sum over a set; hence if f:EMP-->DEPT then we
can define a computable function F, F:DEPT-->NUMBER.OF.EMPS by:

$$
F(d) = \sum_{e \in dom(EMP)} \chi_{f(EMP)=d}(e)
\qquad \text{for } d \in dom(DEPT),
$$

with $\chi$ the characteristic function defined above.
Thus F would count the number of employees in a department.

Let g:DEPT-->MGR; we can define a computable
G:MGR-->NUMBER.OF.EMPS by:

$$
G(m) = \sum_{d \in dom(DEPT)} \left[ \chi_{g(DEPT)=m}(d) \cdot F(d) \right]
$$

which computes the number of employees for a particular manager.
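
The two counting functions can be tried out on a small extension. The
sketch below assumes that an FD is stored as a Python dictionary whose
keys form dom of its left side, and that G sums F over the departments
of a manager. The sample employees, departments and manager are
hypothetical.

```python
def chi(alpha, b):
    """Characteristic function of the predicate alpha(a) = b."""
    return lambda a: 1 if alpha[a] == b else 0


f = {"e1": "d1", "e2": "d1", "e3": "d2"}   # f: EMP-->DEPT
g = {"d1": "m1", "d2": "m1"}               # g: DEPT-->MGR


def F(d):
    """Number of employees in department d (sum over dom(EMP))."""
    return sum(chi(f, d)(e) for e in f)


def G(m):
    """Number of employees of manager m (sum over dom(DEPT))."""
    return sum(chi(g, m)(d) * F(d) for d in g)


print(F("d1"), F("d2"), G("m1"))
```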

The  above  is  just  the  informal  embryo  of  a  language  but  it  is 
enough  for  our  own  purposes. 


Example 1. We are given f1:DEPT-->MGR, f2:EMP-->DEPT,FLOOR,
F3:DEPT,FLOOR-->NUMBER.OF.EMPS, and
F4:MGR,FLOOR-->NUMBER.OF.EMPS.

F3 is computable by:

$$
F_3(d,f) = \sum_{e \in dom(EMP)} \chi_{f_2(EMP)=d,f}(e).
$$

F4 is computable by:

$$
F_4(m,f) = \sum_{d \in dom(DEPT)}
\left[ \chi_{f_1(DEPT)=m}(d) \cdot F_3(d,f) \right].
$$

Using A3(d) on f1:DEPT-->MGR and F4:MGR,FLOOR-->NUMBER.OF.EMPS we
derive G:DEPT,FLOOR-->NUMBER.OF.EMPS where

$$
G(d,f) = F_4(f_1(d),f) = \sum_{d' \in dom(DEPT)}
\left[ \chi_{f_1(DEPT)=f_1(d)}(d') \cdot F_3(d',f) \right].
$$


Clearly [...] whether G is equivalent to F3. Hence there are two
derivations of DEPT,FLOOR-->NUMBER.OF.EMPS with different user
intents. Beeri and Bernstein's solution to this violation is to
change F4 to:

F4:MGR,FLOOR-->NUMBER.OF.EMPS.OF.MGR

Remark. If we did not have f2 (as is the case in the Beeri and
Bernstein [BB] example), F3 would revert to a non-computable f3 but
F4 would remain computable. G would be computable also and a
violation would be detected because there were two derivations of
DEPT,FLOOR-->NUMBER.OF.EMPS, one regular and the other computable.
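
That the two derivations of DEPT,FLOOR-->NUMBER.OF.EMPS in Example 1
really differ can be seen on a tiny, hypothetical extension. The
sketch assumes dictionary representations of f1 and f2 and invented
sample data; it is an illustration, not the detection program itself.

```python
f1 = {"d1": "m1", "d2": "m1"}                         # f1: DEPT-->MGR
f2 = {"e1": ("d1", 1), "e2": ("d2", 1), "e3": ("d1", 2)}  # f2: EMP-->DEPT,FLOOR


def F3(d, floor):
    """Employees of department d on the given floor."""
    return sum(1 for e in f2 if f2[e] == (d, floor))


def F4(m, floor):
    """Employees on the given floor in all departments managed by m."""
    return sum(F3(d, floor) for d in f1 if f1[d] == m)


def G(d, floor):
    """The FD derived by A3(d): G(d,f) = F4(f1(d), f)."""
    return F4(f1[d], floor)


print(F3("d1", 1), G("d1", 1))
```

Here F3 counts one employee for department d1 on floor 1, while G
counts two, because the manager of d1 also manages d2.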

The above discussion contains only enough detail to see that
such a program is feasible and that some of the database
administrator's work can be done automatically. We shall continue
with some more examples.

Example 2. We shall see how the violation caused by MGR and EMP
could be discovered. Let f5:EMP-->MGR and f6:MGR<-->EMP. By
A1(b) applied to f6 we derive g1:EMP<-->MGR and this gives two
derivations of EMP-->MGR, one regular and the other injective.
Here there is a violation. (We could have used transitivity on
f5, f6 to get g2:EMP-->EMP as opposed to the derivation of
EMP<-->EMP by A1(a).)

A more likely example would be EMP<-->SS# and
EMP-->ASST<-->SS#, giving two derivations from EMP to SS#, one
regular and one injective.
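
A search of this kind can be sketched for the simple case of
single-attribute FDs that are either regular or injective. The sketch
applies only A1(b) and A3 with Z empty, and reports every pair of
attributes for which derivations of different types are found. The
function names derive and violations are hypothetical.

```python
def derive(fds):
    """Close a set of (lhs, rhs, kind) triples under A1(b) and A3.

    Only single attributes and the kinds 'regular' and 'injective'
    are handled; trivial FDs X-->X are skipped.
    """
    known = set(fds)
    while True:
        new = set()
        for x, y, kind in known:
            if kind == "injective":
                new.add((y, x, "injective"))              # A1(b)
        for x, y, k1 in known:
            for y2, w, k2 in known:
                if y == y2 and x != w:
                    both = k1 == k2 == "injective"
                    new.add((x, w, "injective" if both else "regular"))
        if new <= known:
            return known
        known |= new


def violations(fds):
    """Attribute pairs that have derivations of more than one kind."""
    kinds = {}
    for x, y, kind in derive(fds):
        kinds.setdefault((x, y), set()).add(kind)
    return {pair for pair, found in kinds.items() if len(found) > 1}


example2 = {("EMP", "MGR", "regular"), ("MGR", "EMP", "injective")}
print(violations(example2))
```

The loop terminates because only finitely many triples exist over a
finite set of attributes. Since this sketch closes the set fully, it
reports every pair with conflicting types, not only the pair named in
the text.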
Example 3. Let f7:STOCK#-->STORE and f8:STOCK#,STORE-->QTY.
The "user intent" of f7 is to map the STOCK# onto STORE of the
store that is in charge of ordering that item and f8 maps STOCK#
and STORE of the store in which it is being sold into the quantity
on hand. Using A3(a) we derive g3:STOCK#-->QTY. Then A2(a)
gives g4:STOCK#,STORE-->QTY. f8 and g4 represent two different
intents of STOCK#,STORE-->QTY, both classified regular. Hence
they could not be distinguished by the classification method. The
above violation is corrected by splitting STORE and with the
present state of the art must be done by the "users".

7. Conclusion

After having investigated the difficulties arising from the
uniqueness assumption, we conclude that the assumption hampers the
ability of a relational database to represent real world
situations. In particular it is natural to allow more than one
dependency between the same entities. On the other hand the
universal relation assumption is needed for schema design
algorithms and other interrelational concepts. We presented a
solution to the problem which splits attributes when necessary and
restricts the use of FDs when violations may occur. One drawback of
the above solution is that it severely shortens the derivation
paths for FDs and increases the number of attribute names. As a
result the number of synthesized relations will, in general, be
greater and the database will be more cumbersome.
In addition the problem of different paths between attributes
is thrust up to the query language. An infinite number of
possible attributes may make it difficult for the designer and
user. The problem of differentiating or equating between
different functional paths seems central (with any database
model). Lastly verification of the assumptions is very time
consuming at best and a database administrator embarking on design
of the relational schema will have to accept them as a matter of
(possibly unjustified) belief. In our classification of FDs we
have taken a step in the direction of automatically detecting
violations of the uniqueness assumption (i.e. differentiating
between paths).


Through private communication with Catriel Beeri, I have
discovered that he has been considering the problems caused by the
universal relation assumption. In particular the concepts of
compatible and potential attributes originated with Beeri.

